A Cultural Algorithm for POMDPs from Stochastic Inventory Control
Abstract
Reinforcement Learning algorithms such as SARSA with an eligibility trace, and Evolutionary Computation methods such as genetic algorithms, are competing approaches to solving Partially Observable Markov Decision Processes (POMDPs), which occur in many fields of Artificial Intelligence. A powerful form of evolutionary algorithm that has not previously been applied to POMDPs is the cultural algorithm, in which evolving agents share knowledge in a belief space that is used to guide their evolution. We describe a cultural algorithm for POMDPs that hybridises SARSA with a noisy genetic algorithm and inherits the latter’s convergence properties. Its belief space is a common set of state-action values that are updated during genetic exploration and, conversely, used to modify chromosomes. We use it to solve problems from stochastic inventory control by finding memoryless policies for nondeterministic POMDPs. Neither SARSA nor the genetic algorithm dominates the other on these problems, but the cultural algorithm outperforms the genetic algorithm, and on highly non-Markovian instances it also outperforms SARSA.
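The interplay the abstract describes (a shared table of state-action values updated while chromosomes are evaluated, and then used to modify chromosomes) can be illustrated with a small Python sketch. This is not the authors' algorithm: the toy inventory model, the coarse observation function, the cost coefficients, the plain one-step SARSA update (no eligibility trace), and every name and parameter below are assumptions chosen only to make the idea concrete.

```python
import random

N_OBS, N_ACT, MAX_STOCK = 3, 3, 6   # toy sizes, assumed for illustration

def observe(stock):
    # Coarse observation hides the exact stock level (partial observability).
    return min(stock // 2, N_OBS - 1)

def step(stock, action, rng):
    # Order `action` units, then face random demand; reward is negative cost.
    stock = min(stock + action, MAX_STOCK)
    demand = rng.randint(0, 2)
    shortage = max(demand - stock, 0)
    stock = max(stock - demand, 0)
    reward = -(1.0 * stock + 4.0 * shortage + 0.5 * action)  # assumed costs
    return stock, reward

def evaluate(policy, q, rng, steps=50, alpha=0.1, gamma=0.95):
    # Noisy fitness of a memoryless policy (observation -> action).
    # While simulating, a one-step SARSA update refines the shared table q.
    stock = rng.randint(0, MAX_STOCK)
    obs = observe(stock)
    act = policy[obs]
    total = 0.0
    for _ in range(steps):
        stock, reward = step(stock, act, rng)
        next_obs = observe(stock)
        next_act = policy[next_obs]
        target = reward + gamma * q[next_obs][next_act]
        q[obs][act] += alpha * (target - q[obs][act])
        obs, act = next_obs, next_act
        total += reward
    return total / steps

def influence(policy, q, rng, rate=0.2):
    # Belief space guides evolution: with probability `rate`, a gene is
    # overwritten by the action that is greedy under the shared values.
    return [max(range(N_ACT), key=lambda a: q[o][a]) if rng.random() < rate else g
            for o, g in enumerate(policy)]

def cultural_search(generations=30, pop_size=10, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACT for _ in range(N_OBS)]          # shared belief space
    pop = [[rng.randrange(N_ACT) for _ in range(N_OBS)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda p: evaluate(p, q, rng), reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_OBS)               # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:                      # point mutation
                child[rng.randrange(N_OBS)] = rng.randrange(N_ACT)
            children.append(influence(child, q, rng))
        pop = parents + children
    best = max(pop, key=lambda p: evaluate(p, q, rng))
    return best, q

if __name__ == "__main__":
    best_policy, belief = cultural_search()
    print("order quantity per observation:", best_policy)
```

Each fitness evaluation is a single noisy rollout, so the ranking within a generation is itself noisy; a more careful version would average several rollouts or resample survivors, which this sketch omits for brevity.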
Similar resources
Particle Filtering for Stochastic Control and Global Optimization
Dissertation by Enlu Zhou, Doctor of Philosophy, 2009. Directed by Professor Steven I. Marcus (Department of Electrical and Computer Engineering) and Professor Michael C. Fu (Department of Decision, Operations, and Information Technologies). This thesis explores new algorithms and results in stochastic control and glob...
Efficiency of a multi-objective imperialist competitive algorithm: A bi-objective location-routing-inventory problem with probabilistic routes
An integrated model considers all parameters and elements of different deficiencies in one problem. This paper presents a new integrated model of a supply chain that simultaneously considers facility location, vehicle routing and inventory control problems as well as their interactions in one problem, called location-routing-inventory (LRI) problem. This model also considers stochastic demands ...
A Particle Filtering Algorithm for Interactive POMDPs
Interactive POMDP (I-POMDP) is a stochastic optimization framework for sequential planning in multiagent settings. It represents a direct generalization of POMDPs to multiagent cases. Expectedly, I-POMDPs also suffer from a high computational complexity, thereby motivating approximation schemes. In this paper, we propose using a particle filtering algorithm for approximating the I-POMDP belief ...
Application of queuing theory in production-inventory optimization
This paper presents a mathematical model for an inventory control system in which customers’ demands and suppliers’ service time are considered as stochastic parameters. The proposed problem is solved through queuing theory for a single item. In this case, transitional probabilities are calculated in steady state. Afterward, the model is extended to the case of multi-item inventory systems. The...
Developing a new stochastic competitive model regarding inventory and price
Within the competition in today’s business environment, the design of supply chains becomes more complex than before. This paper deals with the retailer’s location problem when customers choose their vendors, and inventory costs have been considered for retailers. In a competitive location problem, price and location of facilities affect demands of customers; consequently, simultaneous optimiza...